Abstract



The classic N p chart gives a signal if the number of successes in a sequence of inde- 
pendent binary variables exceeds a control limit. Motivated by engineering applications 
in industrial image processing and, to some extent, financial statistics, we study a simple 
modification of this chart, which uses only the most recent observations. Our aim is to
construct a control chart for detecting a shift of an unknown size, allowing for an unknown
distribution of the error terms. Simulation studies indicate that the proposed chart is
superior in terms of out-of-control average run length, when one is interested in the detection
of very small shifts. We provide a (functional) central limit theorem under a change-point 
model with local alternatives which explains that unexpected and interesting behavior. 
Since real observations are often not independent, the question arises whether these re- 
sults still hold true for the dependent case. Indeed, our asymptotic results work under 
the fairly general condition that the observations form a martingale difference array. This
enlarges the applicability of our results considerably, firstly, to a large class of time series
models, and, secondly, to locally dependent image data, as we demonstrate by an example.

MSC 2000: Primary 62L10, 60F17, 62G20; Secondary 62P30, 68U10, 62P05.

1 Introduction 

Detection of changes in the mean characteristic of produced items is still the most frequently
used tool in quality control. A large variety of control charts have been proposed
in the last fifty years. For comprehensive reviews we refer to Antoch and Jarušková (2002),
Antoch, Hušková, and Jarušková (2002), the monograph Brodsky and Darkhovsky (2000)



1 Address of correspondence: Prof. Dr. A. Steland, RWTH Aachen University, Institute of Statistics,
Wüllnerstr. 3, D-52056 Aachen, Germany.
and also to the articles Woodall (1997), Chakraborti, van der Laan, and Bakir (2001), and
Montgomery (2001). Investigations of their properties indicate that one cannot hope to select
one "universally good" chart, which is uniformly sensitive to small, moderate and large shifts
in the mean and still robust against violations of the normality assumption for the errors. On the
other hand, the wide accessibility of computer systems allows one to run several
control charts with different sensitivity ranges simultaneously for the same process. It is well
known that the Shewhart chart is well tuned to detect large shifts rather quickly, while EWMA and
CUSUM charts are faster in detecting smaller shifts of the order 0.5σ. If the aim is to detect moderate
to large jumps, so-called jump-preserving procedures are attractive, which are special cases
of the unifying vertically weighted regression approach studied by Pawlak and Rafajłowicz
(1999), Steland (200?), and Pawlak, Rafajłowicz, and Steland (200?, 2008). Nonparametric
kernel control charts and the optimization for certain out-of-control models covering mixing
processes have been studied in Steland (2004) and Steland (2005). Further, Wu and Spedding
(2000) combined a classic Shewhart chart and a conforming run length chart yielding smaller
ARLs for shifts larger than 0.8σ, but that method is inferior to the EWMA chart for smaller
shifts.

The purpose of this paper is to propose a new binary chart, which is easy to apply, has enlarged 
sensitivity to very small shifts, and is robust with respect to deviations from normality. We 
provide a comprehensive study covering the methodology, asymptotic theory, practical issues 
of control chart design, and extensive Monte Carlo simulations. 

Our study is motivated as follows: Although computing power has considerably increased, 
many practical applications still require detection procedures which are extremely fast to 
calculate. An example, which motivated our investigation, is the surveillance of copper production
as outlined in Pawlak, Rafajłowicz, and Steland (2008). Here the problem is to detect
defects and cracks resulting in lower quality. The copper is surveyed by a camera taking many
high-resolution images per second, and each column of an image is analyzed in real time to 
detect defects. Only detectors which are fast enough to calculate can be employed. In such 
engineering image processing and image analysis applications one has to deal with the spatial 
inhomogeneity of the grey level of pixels. One can either assume that the inhomogeneity is 
compensated by a quite wiggly mean function which is disturbed by independent noise, or 
assume a smooth mean function overlaid by dependent noise. In the latter case fitting complex
models to take account of dependencies is often not feasible in real-time applications.
Then it is important to know how the chosen method behaves for dependent data. Let us also
mention a further important area, namely the application of monitoring procedures to financial
data. In financial statistics various empirical analyses have revealed that asset returns are
usually uncorrelated but the squares are serially correlated and are affected by conditional 
heteroscedasticity which produces the clusters of strongly dispersed returns seen in real data. 
Various models for returns assume or imply the martingale difference property. 
Having in mind the above applications, we propose a simple method where one thresholds
the observations to obtain binary data and applies a control chart based on the number of
data points exceeding the threshold. In contrast to the classic Np-chart, the chart uses a
finite buffer storing only the most recent observations. Our simulation results indicate that
such a modified p-chart with a reduced number of observations reacts on average slower than
several control charts studied recently in Han and Tsung (2004) for shifts larger than 0.25σ,
but provides faster detection for very small shifts.

We provide an appropriate theoretical framework and prove a functional central limit theorem 
which shows that the sensitivity of the classic Np chart with respect to very small shifts can indeed
be improved by taking fewer observations into account. As argued above, the question arises
whether the result still holds true when the independence assumption underlying the classic p 
chart is dropped. The answer is positive: Our main result and its interpretation holds true for 
a large class of dependent processes, namely the class of triangular arrays of random variables 
forming a martingale difference array with respect to some filtration. Thus, the benefits of 
the modified p chart are also effective when monitoring dependent data. 

The paper is organized as follows. In Section 2 we introduce the proposed control chart
and its relationship to the classic Np-chart. An appropriate change-point model with local
alternatives is introduced in Section 3 to study the problem from an asymptotic viewpoint.
We establish a functional central limit theorem for the underlying stochastic process which
induces the stopping time of interest. A proof of the main result is postponed to an appendix.
Practical issues of control chart design are discussed in detail in Section 4. Finally, an extensive
Monte Carlo study is presented in Section 5 providing a comparison with recently proposed
control charts.



2 Statistical model and a modified p-chart 

Our aim is to construct a control chart for detecting a shift of an unknown size m allowing 
for an unknown distribution F of the error terms. It is required that the in-control average 
run length (in-control ARL) of the chart can be tuned to sufficiently large values in order to 
reduce the number of false alarms. Simultaneously, the out-of-control ARL should be small, 
leading to quick detection of the jump after its occurrence. For a discussion of the design
of control limits and their relationship to alarm rates and ARLs we refer to Margavio et al.
(1995).

Even if the underlying distribution is normal, the Shewhart control chart is not powerful 
for detecting small changes, say m of the order of 0.1σ to 0.25σ, where σ denotes the standard
deviation of the errors. The EWMA (exponentially weighted moving average) control chart
is better suited to this purpose, but its performance is still not satisfactory in the range of
very small shifts. For this reason a number of modifications of the Shewhart, EWMA, and
CUSUM charts have been proposed recently (see Han and Tsung (2004) and the bibliography
cited therein). However, the design of a concrete control procedure with specific properties 
requires knowledge of the error distribution. 



2.1 Change-point model 

In this paper, we consider a classic change-point model, where the observations are of the
form

$$Y_n = Y + m \cdot 1(n - q) + \varepsilon_n, \qquad n = 1, 2, \dots \tag{1}$$

Here $Y$ denotes the desired level of quality (target value), which is disturbed by random errors
$\varepsilon_n$. The deterioration of quality is modelled by a jump (permanent shift in the quality
characteristic) of height $m$ which appears at time instant $q > 0$. $q$ is called the change-point
and is assumed to be non-stochastic but unknown. $1(t)$ denotes the indicator function of the set
$[0, \infty)$, i.e.,

$$1(t) = \begin{cases} 1, & \text{if } t \ge 0, \\ 0, & \text{if } t < 0. \end{cases} \tag{2}$$

Thus, starting at the change-point $q$ there is a jump of height $m$. In Section 3 we consider a
change-point model allowing for jump sizes tending to 0 at a certain rate.

To simplify the exposition, we shall assume $Y = 0$ in what follows. For the same reason, let us
tentatively assume that the error terms $\varepsilon_n$ in (1) are independent and identically distributed
random variables. That assumption will be relaxed in the next section. Whereas classic procedures
are restricted to normally distributed noise, we allow for arbitrary distribution functions
$F$ which are symmetric about 0, i.e.,

$$F(x) = 1 - F(-x), \qquad x \in \mathbb{R}. \tag{3}$$

In particular, we allow for heavy-tailed distributions such as the Cauchy distribution, which has
no finite expectation, or the Laplace (double exponential) law, whose tails are lighter than those
of the Cauchy distribution. Note that we do not require the error terms to possess a density $f$, but
if they do, (3) implies $f(x) = f(-x)$.
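To make model (1) concrete, the following minimal Python sketch draws a series with target value 0 and a jump of height m at the change-point q. The function name `simulate_series`, its arguments, and the three noise choices are illustrative assumptions and not part of the paper; all three noise laws are symmetric about 0 as required by (3).

```python
import math
import random


def simulate_series(n_obs, q, m, noise="normal", seed=0):
    """Draw Y_1, ..., Y_n from model (1) with target value Y = 0 (hypothetical helper)."""
    rng = random.Random(seed)

    def draw():
        if noise == "normal":
            return rng.gauss(0.0, 1.0)
        if noise == "laplace":
            # The difference of two independent Exp(1) variables is standard Laplace.
            return rng.expovariate(1.0) - rng.expovariate(1.0)
        if noise == "cauchy":
            # Inverse-cdf method for the standard Cauchy law.
            return math.tan(math.pi * (rng.random() - 0.5))
        raise ValueError("unknown noise law: " + noise)

    # The indicator 1(n - q) equals 1 for n >= q and 0 otherwise.
    return [m * (1 if n >= q else 0) + draw() for n in range(1, n_obs + 1)]


ys = simulate_series(n_obs=50, q=26, m=0.25, noise="laplace", seed=1)
```

With a very small m, as in the settings of interest here, the jump is not visible in a single observation, which is why the charts below accumulate information over many observations.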



2.2 The binary control chart revisited 

The classic nonparametric Np-chart is distribution-free under quite general assumptions,
and therefore is applicable when the error distribution is unknown. Although we confine our 
discussion to the case that the change from the in-control to the out-of-control scenario is 
given by a sharp jump, our approach can also be used for more general scenarios, because 
the construction of the control chart does not require knowledge of the underlying error 
distribution. As we shall see below, the chart proposed in this article provides noticeably
smaller out-of-control ARLs than the classical and recently proposed control charts, but only
for very small shifts, which are of the order of 0.1-0.25 standard deviations, or the equivalent
based on the interquartile range if the variance does not exist. A large number of theoretical
investigations and computer simulations bear witness to the fact that one cannot expect
the existence of one "universal" chart with best performance in the whole range of shifts in the
mean, if the underlying distribution and the jump height are not specified. From this point of view,
the binary chart occupies the region of small shifts.

Let us briefly review the definition and basic properties of the classic Np-chart. Obviously, if
the process (1) is in-control and (3) holds, then, roughly, half of the observations should be
positive and the rest are expected to be negative. In other words, having $N > 1$ observations

$$Z_n \stackrel{\mathrm{def}}{=} \operatorname{sign}(Y_n) = \begin{cases} 0, & \text{if } Y_n < 0, \\ 1, & \text{if } Y_n > 0, \end{cases} \qquad n = 1, 2, \dots, N \tag{4}$$

and introducing the counting random variable

$$I_N \stackrel{\mathrm{def}}{=} \operatorname{card}\{Z_i = 1,\ i = 1, 2, \dots, N\} = \sum_{i=1}^{N} Z_i \tag{5}$$

we have $E(I_N) = N/2$, since $I_N$ is a binomial random variable corresponding to $N$ trials and
success probability $p_0 = 1/2$. Here and in the sequel $E$ denotes the expectation.

If a shift of size m occurred, then the distribution of subsequent $Y_n$'s is no longer symmetric
around zero and the probability of $Z_n = 1$ changes to

$$p_1 = 1 - F(-m), \tag{6}$$

where $p_1$ can be larger or smaller than 1/2, depending on whether m is positive or negative.
Summarizing, one can detect a shift m by testing the hypothesis $H_0 : p_0 = 1/2$ against the
alternative that the success probability in one trial is different from 1/2.
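As a small numerical illustration of (6), the sketch below evaluates $p_1 = 1 - F(-m)$ for three symmetric error laws using their closed-form distribution functions. The function names and the unit scale parameters are assumptions made for this example only.

```python
import math


def p1_normal(m, sigma=1.0):
    # 1 - Phi(-m / sigma) = Phi(m / sigma), written with the error function.
    return 0.5 * (1.0 + math.erf(m / (sigma * math.sqrt(2.0))))


def laplace_cdf(x, b=1.0):
    # Distribution function of the Laplace law with location 0 and scale b.
    if x < 0:
        return 0.5 * math.exp(x / b)
    return 1.0 - 0.5 * math.exp(-x / b)


def p1_laplace(m, b=1.0):
    return 1.0 - laplace_cdf(-m, b)


def p1_cauchy(m, gamma=1.0):
    # Cauchy cdf: F(x) = 1/2 + arctan(x / gamma) / pi.
    return 1.0 - (0.5 + math.atan(-m / gamma) / math.pi)


for m in (0.1, 0.25, 0.5):
    print(m, p1_normal(m), p1_laplace(m), p1_cauchy(m))
```

For small m all three values stay close to 1/2, which shows why detecting such shifts from binary data requires many observations.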

If the process is in-control, the dispersion of the binomial r.v. $I_N$ equals $\sqrt{N p_0 (1 - p_0)}$.
Then, $I_N/N$ has expectation $p_0$ and dispersion $\sqrt{p_0 (1 - p_0)/N}$. Approximating the binomial
distribution by the corresponding normal law we arrive at the well known Np-chart with
upper control limit

$$\mathrm{UCL} = p_0 + k \sqrt{p_0 (1 - p_0)/N} \tag{7}$$

and the lower control limit (LCL)

$$\mathrm{LCL} = p_0 - k \sqrt{p_0 (1 - p_0)/N}, \tag{8}$$

where k is selected according to the required in-control average run length (ARL), the standard
choice being k = 3. If $I_N/N$ is outside the interval (LCL, UCL), then the out-of-control state
is claimed. Repeating the above reasoning, we can obtain the $N p_0$ version of this chart with
the following control limits for $I_N$:

$$N p_0 \pm k \sqrt{N p_0 (1 - p_0)}, \tag{9}$$

where k is selected as above. For further discussions we refer to Montgomery (2001).
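The control limits (9) for the count $I_N$ can be computed directly. The following sketch is only an illustration; the function names `np_chart_limits` and `np_chart_signal` are hypothetical, and the binary sample is passed as a list of 0/1 values.

```python
import math


def np_chart_limits(N, p0=0.5, k=3.0):
    """Return (LCL, UCL) for the count I_N according to (9)."""
    center = N * p0
    half_width = k * math.sqrt(N * p0 * (1.0 - p0))
    return center - half_width, center + half_width


def np_chart_signal(z, p0=0.5, k=3.0):
    """True if the number of ones in the sample z lies outside (LCL, UCL)."""
    lcl, ucl = np_chart_limits(len(z), p0, k)
    count = sum(z)
    return count < lcl or count > ucl


print(np_chart_limits(100))  # limits for N = 100, p0 = 1/2, k = 3
```

Note that this classic version evaluates one complete sample of size N at a time, which is the sampling scheme modified in the next subsection.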



2.3 Modified p chart 

The above chart is the starting point for our modifications. They are necessary, since the
classical chart (Montgomery, 2001, pp. 284-294) is based on counting nonconforming items
in samples of size N, which are either taken daily or on N consecutive days, if only one
observation is available each day. In the latter case, which is the setting we have in mind,
the chart is applied only at every Nth time instant. This can yield substantially larger delays
in detection. Obviously, such sampling schemes are not appropriate for our purposes. Thus,
we shall modify the chart in such a way that it counts the observations with $Z_n = 1$ among a
fixed number, $M > 1$ say, of previous individual observations in a moving window. If the process
is in-control, then we expect that about M/2 observations correspond to $Z_n = 1$.

More formally, we form a finite buffer of length M, which contains only M past observations,
excluding the latest one $Z_n$. M is called the buffer length. When observation $Z_n$ is
available, it replaces $Z_{n-1}$, which is pushed to replace $Z_{n-2}$, and so on. At each time instant
n the present buffer content is used to verify whether the process is in-control. To fix this
idea, define the number of positive observations contained in the buffer at time n,

$$J_n = \operatorname{card}\{Z_i = 1,\ i = n-1, \dots, n-M\} = \sum_{i=n-M}^{n-1} Z_i. \tag{10}$$

Note that the initial content of the buffer has to be specified. The proposed modified
p-chart is built on the assumption that historical pre-run data are available which are known
to form a random sample of the in-control process. Thus, in the sequel we assume that at time
n = 0 the buffer contains past observations of the in-control process, which are numbered as
$Z_{-1}, \dots, Z_{-M}$. Formally, we start the chart at n = 0, when the observation $Z_0$ arrives. Then,
for n = 1, 2, ... it is verified whether the control statistic $J_n$ lies between the control limits

$$\mathrm{UCL} = M p_0 + k \sqrt{M p_0 (1 - p_0)}, \tag{11}$$

and

$$\mathrm{LCL} = M p_0 - k \sqrt{M p_0 (1 - p_0)}. \tag{12}$$

Clearly, for $p_0 = 1/2$ these formulas simplify to $\mathrm{UCL} = M/2 + k\sqrt{M}/2$ and
$\mathrm{LCL} = M/2 - k\sqrt{M}/2$. If $J_n$ is smaller than LCL or larger than UCL, then the out-of-control
state is signaled. Note that the difference between UCL and LCL is constant for this chart.

The main difference between the proposed chart and the classical one can be summarized as
follows. The classical Np chart is based on samples of size N from non-overlapping production
intervals. In contrast, our chart counts events $Z_n = 1$ in the buffer of length M, which is
moving forward with n, in such a way that a new observation $Z_n$ enters the buffer, while the
oldest one is pushed out of it. In other words, the contents of the buffer at time n and at time
n + 1 overlap heavily.
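The moving-buffer scheme can be sketched in a few lines of Python. This is a minimal illustration under the conventions of this section (pre-run data fill the buffer, the stream delivers $Z_0, Z_1, \dots$, and $J_n$ is checked for n = 1, 2, ...); the function name `modified_p_chart` and its interface are assumptions, not the authors' implementation.

```python
import math
from collections import deque


def modified_p_chart(z_stream, prerun, k, p0=0.5):
    """Return the first n >= 1 with J_n outside (LCL, UCL), or None if no signal.

    prerun holds the M in-control pre-run values (oldest first),
    z_stream yields the binary observations Z_0, Z_1, ...
    """
    M = len(prerun)
    buf = deque(prerun, maxlen=M)      # finite buffer of length M
    count = sum(buf)                   # running value of the buffer sum
    half_width = k * math.sqrt(M * p0 * (1.0 - p0))
    lcl, ucl = M * p0 - half_width, M * p0 + half_width   # limits (11), (12)
    for n, z in enumerate(z_stream, start=1):
        count += z - buf[0]            # oldest value leaves, new value enters
        buf.append(z)                  # deque with maxlen drops the oldest item
        if count < lcl or count > ucl:
            return n                   # J_n = sum of Z_{n-M}, ..., Z_{n-1}
    return None


prerun = [0, 1] * 5                    # hypothetical pre-run data, M = 10
print(modified_p_chart([1] * 20, prerun, k=2.0))
```

Keeping a running count makes each update O(1), in line with the requirement that the detector be fast enough for real-time use.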



3 Asymptotic results 

We will now present some asymptotic theory for the proposed procedure providing an explanation
of the superiority of the modified p chart for small jumps. To simplify exposition,
we slightly change the setting: We confine our study to a truncated version of the one-sided
control chart which gives a signal if $J_n$ exceeds UCL for some $1 \le n \le N$. However, our
results can be extended to deal with the general case as outlined in Steland (2008). The small
jump setting will be modelled by an appropriate asymptotic change-point model assuming a
local alternative for the probabilities resp. jump heights.

To simplify our exposition, we introduce a maximum sample size N where monitoring stops
in any case. Let us also rescale time by the transformation $t \mapsto \lfloor Nt \rfloor$, $t \in [0, 1]$, where
$\lfloor x \rfloor$ denotes the largest integer smaller than or equal to $x$, $x \in \mathbb{R}$. In the sequel, the
current time point n will correspond to t, i.e., $n = \lfloor Nt \rfloor$.

Define the process

$$J_N(t) = \frac{1}{\sqrt{N}} \sum_{i=\lfloor Nt \rfloor - M}^{\lfloor Nt \rfloor - 1} (Z_i - p_0), \qquad t \in [(M+1)/N, 1].$$

Note that $J_N(n/N)$ is equal to the statistic $J_n$ centered at its in-control expectation and
scaled by $N^{-1/2}$. Now, the truncated version of the upper control chart of the last section,
which gives a signal if $J_n$ exceeds UCL, corresponds to the stopping time

$$S_N = \min\{M + 1 \le n \le N : J_n > M p_0 + k \sqrt{M p_0 (1 - p_0)}\}.$$

We can represent $S_N$ via the process $J_N(t)$. Indeed, we have

$$S_N = N \inf\Big\{ t \in [(M+1)/N, 1] : J_N(t) > k \sqrt{\tfrac{M}{N}\, p_0 (1 - p_0)} \Big\}, \qquad N \ge 1. \tag{13}$$

For the asymptotic framework in this section, let us assume that the buffer length, M, is
chosen as an $\mathbb{N}$-valued function of $n = \lfloor Nt \rfloor$, i.e., $M = M_{\lfloor Nt \rfloor}$, satisfying the growth condition

$$\frac{M_{\lfloor Nt \rfloor}}{N} \to M(t), \tag{14}$$

as the maximum sample size N tends to $\infty$. Here $M : [0, 1] \to [0, 1]$ is a non-decreasing
function which is continuous on (0, 1] with M(0) = 0. We will call M the asymptotic buffer length
(strategy). Condition (14) ensures that, asymptotically, the buffer length M is not too small
compared to N.

To ensure that the buffer is not longer than the available time series, we impose the following
condition.

Assumption (N): The buffer length strategy $M : [0, 1] \to [0, 1]$ satisfies the natural condition

$$M(t) \le t \qquad \text{for all } t \in [0, 1].$$

We shall show that under the following assumption the modified chart is superior to the
classic one.

Assumption (M): The buffer length strategy satisfies the modifier condition, if

$$M(t) < t \qquad \text{for all } t \in (0, 1]. \tag{15}$$

Let us now consider some examples.

Example 3.1. Put $M(0) = 0$ and $M_{\lfloor Nt \rfloor} = \lfloor \xi t N \rfloor$, $t \in (0, 1]$, for some $\xi \in (0, 1]$. Obviously,
the natural condition (N) is satisfied iff $\xi \le 1$. In particular, the classic Np chart is given
by $M_{\lfloor Nt \rfloor} = \lfloor Nt \rfloor$, $t \in [0, 1]$, thus corresponding to $\xi = 1$ and $M(t) = t$, $t \in [0, 1]$.

The following example considers the case that the buffer lengths $M_n$ are constant with respect
to n.

Example 3.2. Suppose $M_{\lfloor Nt \rfloor} = \lfloor N\eta \rfloor$ for some constant $\eta \in (0, 1]$. For $t \in [0, \lfloor N\eta \rfloor / N]$
the available data $Y_1, \dots, Y_{\lfloor Nt \rfloor}$ do not fill the buffer. One may assume that pre-run data
$Y_{-M+1}, \dots, Y_0$ are available. However, to ensure a fair comparison with the classic Np chart,
let us consider the choice

$$M_{\lfloor Nt \rfloor} = \min(\lfloor Nt \rfloor, \lfloor N\eta \rfloor)$$

yielding $M(t) = \min(t, \eta)$. Now the modified chart does not require historical data at the
beginning. It starts as the classic chart and is modified as time proceeds to catch small late
changes better.

Let us now consider an appropriate asymptotic change-point model for a small jump at
location q. Assume that

$$1 \le q = \lfloor N\vartheta \rfloor \tag{16}$$
for some constant $\vartheta \in (0, 1)$ which specifies the fraction of the maximum sample size N where
the jump occurs. We model the out-of-control probability $p_1$ as a sequence of local alternatives
given by

$$p_1 = p_{N1} = p_0 + \Delta/\sqrt{N},$$

such that $\Delta = \sqrt{N}\,(p_1 - p_0) > 0$.

Note that this model yields a triangular array of observations,

$$Z_{Ni}, \qquad 1 \le i \le N, \quad N \ge 1,$$

where for each N the random variables $Z_{N1}, \dots, Z_{NN}$ are independent with $E(Z_{Ni}) = p_0$
for $1 \le i < q$ and $E(Z_{Ni}) = p_{N1}$ for $q \le i \le N$. Below we shall drop the independence
assumption.
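The ingredients of this asymptotic setting are simple to evaluate for a given N. The sketch below computes buffer lengths of the linear and the capped type and the local alternative $p_{N1}$; the function names and the reading of the capped strategy as $\min(\lfloor Nt \rfloor, \lfloor N\eta \rfloor)$ are assumptions made for illustration.

```python
import math


def buffer_linear(N, t, xi):
    """Buffer length floor(xi * t * N); xi = 1 gives the classic Np chart."""
    return math.floor(xi * t * N)


def buffer_capped(N, t, eta):
    """Buffer length min(floor(N t), floor(N eta)), so that M(t) = min(t, eta)."""
    return min(math.floor(N * t), math.floor(N * eta))


def local_alternative(p0, delta, N):
    """Out-of-control success probability p_N1 = p0 + delta / sqrt(N)."""
    return p0 + delta / math.sqrt(N)


N = 1000
for t in (0.2, 0.5, 0.8):
    # The ratio M / N approximates the asymptotic buffer length M(t), cf. (14).
    print(t, buffer_capped(N, t, eta=0.5) / N, local_alternative(0.5, 1.0, N))
```

For the capped strategy the ratio equals t up to time eta and stays at eta afterwards, so the modifier condition M(t) < t holds only for t > eta.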

Remark 3.1. For our purposes it is appropriate to formulate the change-point model in
terms of the probabilities $p_0$ and $p_1$, but let us briefly discuss how it relates to a model for
the jump height m. Assume the underlying probability density $f(x)$ is continuous and bounded
in a neighborhood of 0. If we consider a local alternative model for the jump height where
$m_N = \Delta_m/\sqrt{N}$ for a positive constant $\Delta_m$, (6) and the mean value theorem give

$$p_1 - p_0 = f(\xi_N)\, \Delta_m/\sqrt{N}$$

for points $\xi_N$ between 0 and $\Delta_m/\sqrt{N}$. Thus, in this case

$$p_1 = p_0 + (f(0) + o(1))\, \Delta_m/\sqrt{N}.$$
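The step from (6) to the mean value representation can be spelled out as follows. This is a short worked derivation under the assumptions of Section 2, namely $p_0 = 1/2$ and a symmetric $F$ that has a density $f$ near 0; the intermediate lines are added here for illustration.

```latex
\begin{align*}
p_1 - p_0 &= \bigl(1 - F(-m_N)\bigr) - \tfrac{1}{2}
          && \text{by (6) and } p_0 = 1/2, \\
          &= F(m_N) - F(0)
          && \text{by the symmetry (3), which also gives } F(0) = 1/2, \\
          &= \int_0^{m_N} f(x)\, dx = f(\xi_N)\, m_N
          && \text{for some } \xi_N \in [0, m_N].
\end{align*}
```

Since $m_N \to 0$ and $f$ is continuous at 0, $f(\xi_N) = f(0) + o(1)$, which is the last display of the remark with $\Delta = f(0)\Delta_m$ in the limit.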



In the sequel, $B(t)$, $t \in [0, 1]$, denotes a standard Brownian motion with $B(0) = 0$, i.e., a
centered Gaussian process with covariance function $\operatorname{Cov}(B(s), B(t)) = \min(s, t)$, $s, t \in [0, 1]$.
The process $J_N(t)$, $t \in [0, 1]$, is an element of the Skorohod space $D[0, 1]$ of all functions
$f : [0, 1] \to \mathbb{R}$ which are right-continuous with existing limits from the left. We denote
distributional convergence (weak convergence) for a sequence $\{X, X_n\} \subset D[0, 1]$ by $X_n \Rightarrow X$,
as $n \to \infty$. For details we refer to Billingsley (1991) and Shorack (2000).



Our main result works under very general assumptions. Indeed, it just requires that the
random variables $Z_{Ni} - \mu_{Ni}$ form a martingale difference array with
$E(Z_{Ni}^r \mid \mathcal{F}_{N,i-1}) = \mu_{Ni}$ for all i and r = 1, 2, for some filtration $\{\mathcal{F}_{Ni}\}$. In this
case, the expectation in (16) is replaced by the conditional expectation $E(Z_{Ni} \mid \mathcal{F}_{N,i-1})$.
Recall that an array $\{X_{n,m} : 1 \le m \le n_k, n \ge 1\}$ of random variables defined on a common
probability space $(\Omega, \mathcal{F}, P)$ is called a martingale difference array with respect to
$\{\mathcal{F}_{n,m}\}$, if $\{\mathcal{F}_{n,m}\}$ forms a filtration, i.e.,

$$\mathcal{F}_{n,0} = \{\emptyset, \Omega\} \subset \mathcal{F}_{n,1} \subset \cdots \subset \mathcal{F}_{n,n_k} \subset \mathcal{F},$$

each $X_{n,m}$ is $\mathcal{F}_{n,m}$-measurable, and $E(X_{n,m} \mid \mathcal{F}_{n,m-1}) = 0$ for all $1 \le m \le n_k$ and $n \ge 1$.
The martingale difference assumption is a natural approach to deal with time series. However,
it is also suited and general enough to treat (locally) dependent image data, as demonstrated 
by the following example working with sliced rectangular neighborhoods. 

Example 3.3. (A Model for Locally Dependent Image Data)

Suppose each column of an image consisting of I columns and J rows is analyzed from bottom
to top. Assume the origin (0, 0) corresponds to the lower left corner and the pixels are denoted
by $(i, j) \in \mathcal{I} \times \mathcal{J} = \{0, \dots, I\} \times \{0, \dots, J\}$ for integers I, J. Let
$\{\xi_{ij} : (i, j) \in \mathcal{I} \times \mathcal{J}\}$ be an array of i.i.d. random variables with common d.f. F
satisfying $E(\xi_{ij}) = 0$ and $\operatorname{Var}(\xi_{ij}) = 1$ for all $(i, j) \in \mathcal{I} \times \mathcal{J}$, representing the
background noise of an image. For $h \ge 1$ define a sliced h-neighborhood for the pixel (i, j) by

$$\mathcal{N}_{ij} = \{(k, l) \in \mathcal{I} \times \mathcal{J} : (k = i \wedge l \le j) \vee (1 \le |i - k| \le h \wedge l \le j + h)\}$$

and denote by $\Xi_{ij} = \{\xi_{kl} : (k, l) \in \mathcal{N}_{ij}\}$ the corresponding set of $\xi_{kl}$'s. $\mathcal{N}_{ij}$ is a
rectangle with width 2h + 1 and height j + h, sliced along the line from (i, j) to (i, j + h). Then
$\mathcal{N}_{i1} \subset \cdots \subset \mathcal{N}_{iJ}$, and consequently the family

$$\mathcal{F}_{i0} = \{\emptyset, \Omega\}, \qquad \mathcal{F}_{ij} = \sigma(\Xi_{ij}) = \sigma(\xi_{kl} : (k, l) \in \mathcal{N}_{ij}),$$

defines a filtration. For what follows, notice that $\xi_{ij}$ is not an element of the set $\Xi_{i,j-1}$. Let
us now assume that the errors disturbing the true image are given by the model equations

$$\varepsilon_{ij} = h_{ij}\, \xi_{ij}, \qquad (i, j) \in \mathcal{I} \times \mathcal{J},\ j = 2, \dots, J,$$

for $\mathcal{F}_{i,j-1}$-measurable random variables $h_{ij}$ with existing second moments. Then
$h_{ij} = H_{ij}(\Xi_{i,j-1})$ for functions $H_{ij}$. Obviously, $\varepsilon_{ij}$ is $\mathcal{F}_{ij}$-measurable and, since
$\xi_{ij}$ is independent of the random variables of the set $\Xi_{i,j-1}$, we have
$E(\xi_{ij} \mid \mathcal{F}_{i,j-1}) = E(\xi_{ij}) = 0$ yielding

$$E(\varepsilon_{ij} \mid \mathcal{F}_{i,j-1}) = h_{ij}\, E(\xi_{ij}) = 0.$$

Thus, $\{\varepsilon_{ij} : (i, j) \in \mathcal{I} \times \mathcal{J}\}$ is a martingale difference array, and
$\{\varepsilon_{ij} : j \in \mathcal{J}\}$ is a martingale difference sequence with respect to
$\{\mathcal{F}_{ij} : j = 0, \dots, J\}$ for each $i \in \mathcal{I}$. Since

$$\operatorname{Var}(\varepsilon_{ij} \mid \mathcal{F}_{i,j-1}) = h_{ij}^2,$$

$h_{ij}^2$ is the conditional variance given the neighboring pixels. In particular, $h_{ij}^2$ may depend on
the noise levels of these neighboring pixels. Recall that when the k-th column is analyzed, $Z_{Ni}$
is given by $Z_{Ni} = 1(\varepsilon_{ki} < 0)$ for $i = 1, \dots, N = J$. We have

$$E(Z_{Ni} \mid \mathcal{F}_{k,i-1}) = P(h_{ki}\, \xi_{ki} < 0 \mid \mathcal{F}_{k,i-1}) = F(0/h_{ki}) = p_0 = 1/2,$$

and $\operatorname{Var}(Z_{Ni} \mid \mathcal{F}_{k,i-1}) = p_0 (1 - p_0)$. Consequently, the random variables $Z_{Ni} - p_0$,
$i = 1, \dots, N$, also form a martingale difference array with respect to the filtration
$\{\mathcal{F}_{ki} : i = 1, \dots, N\}$ with common conditional variance $p_0 (1 - p_0)$.
We are now in a position to formulate our main result concerning the weak convergence of
the process $J_N(t)$ and the corresponding central limit theorem for the modified chart.

Theorem 3.1. Suppose (N) holds and that the random variables
$\xi^*_{Ni} = (Z_{Ni} - \mu_{Ni})/\sqrt{\mu_{Ni}(1 - \mu_{Ni})}$
form a martingale difference array with respect to some filtration $\{\mathcal{F}_{Ni}\}$, such that

$$E(\xi^*_{Ni} \mid \mathcal{F}_{N,i-1}) = 0 \quad \text{and} \quad \operatorname{Var}(\xi^*_{Ni} \mid \mathcal{F}_{N,i-1}) = 1,$$

for all $1 \le i \le N$, $N \ge 1$. Then the following conclusions hold true.

(i) If there is no change-point, the process $J_N$ converges weakly,

$$J_N(t) \Rightarrow \eta_0\, [B(t) - B(t - M(t))],$$

as $N \to \infty$, where

$$\eta_0^2 = \lim_{N \to \infty} \operatorname{Var}\Big( N^{-1/2} \sum_{i=1}^{N} (Z_{Ni} - E(Z_{Ni})) \Big) = p_0 (1 - p_0).$$

The normed stopping time converges in distribution,

$$S_N/N \xrightarrow{d} \tau_M,$$

where

$$\tau_M = \inf\{t \in [0, 1] : B(t) - B(t - M(t)) > k \sqrt{M(t)}\}.$$

(ii) Under the local change-point model (16), the process $J_N$ converges weakly,

$$J_N(t) \Rightarrow J^{\vartheta}(t) = \begin{cases}
\eta_0\, [B(t) - B(t - M(t))], & t < \vartheta, \\
\eta_0\, [B(t) - B(t - M(t))] + (t - \vartheta)\Delta, & \vartheta \le t < \vartheta + M(t), \\
\eta_0\, [B(t) - B(t - M(t))] + M(t)\Delta, & \vartheta + M(t) \le t,
\end{cases}$$

as $N \to \infty$. The normed stopping time converges in distribution,

$$S_N/N \xrightarrow{d} \tau^{\vartheta} = \inf\{s \in [0, 1] : J^{\vartheta}(s) > k \sqrt{M(s)}\, \eta_0\}.$$

Remark 3.2. Notice that the standard i.i.d. setting, where it is assumed that $Z_{N1}, \dots, Z_{NN}$
are independent and identically distributed Bernoulli variables with success probability $p_0$, is
covered as a special case.

The above theorem says that, asymptotically, the control chart behaves as the stopping time
$\tau_M$ which is driven by the stochastic process

$$V(t) = B(t) - B(t - M(t)).$$

Notice that the one-dimensional marginals of $V(t)$ are distributed as $B(M(t))$. Further, for
$s \le t$ we have

$$E\, V(s) V(t) = \begin{cases}
0, & s - M(s) < s \le t - M(t) < t, \\
s - t + M(t), & s - M(s) \le t - M(t) < s \le t, \\
M(s), & t - M(t) \le s - M(s) < s \le t.
\end{cases}$$

For small values of $|s - t|$, i.e., locally, the process $V(t)$ behaves similarly to the process
$B(M(t))$, if $M(t)$ is a smooth function.
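All three cases of the covariance formula express one fact: $E\,V(s)V(t)$ is the length of the overlap of the intervals $(s - M(s), s]$ and $(t - M(t), t]$, because Brownian increments over disjoint intervals are uncorrelated. The sketch below computes this overlap; the function name `cov_V` and the example strategy $M(t) = \min(t, 0.2)$ are assumptions chosen for illustration.

```python
def cov_V(s, t, M):
    """Covariance of V(s) and V(t), V(u) = B(u) - B(u - M(u)), for a strategy M."""
    lower = max(s - M(s), t - M(t))   # left end of the common interval
    upper = min(s, t)                 # right end of the common interval
    return max(0.0, upper - lower)    # overlap length, zero if disjoint


def capped(u):
    # Hypothetical buffer strategy M(u) = min(u, 0.2), cf. Example 3.2.
    return min(u, 0.2)


def classic(u):
    # Classic Np chart: M(u) = u, so that V(u) = B(u).
    return u


print(cov_V(0.5, 0.6, capped))    # second case: s - t + M(t)
print(cov_V(0.3, 0.6, capped))    # first case: disjoint increments
print(cov_V(0.3, 0.6, classic))   # Brownian covariance min(s, t)
```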

The above theoretical results explain the benefits of using the modified binary chart: Assume
(M) and suppose a signal is given at time $t \in [\vartheta, \vartheta + M(t))$ where $\eta < \vartheta$ (cf. Example 3.2).
In this case

$$\frac{V(t)}{\sqrt{M(t)}} + \frac{(t - \vartheta)\Delta}{\eta_0 \sqrt{M(t)}} > k.$$

Right before the threshold k is hit, the behavior of the random part of the left hand side can
be approximated by the process $B(M(t))/\sqrt{M(t)}$, which has expectation 0, variance 1 for
any function $M(t)$, and covariance function

$$\frac{\min(M(s), M(t))}{\sqrt{M(s) M(t)}}.$$

For small values of $|s - t|$ and smooth $M(t)$ this is approximately a Brownian motion. Consider
the drift term $(t - \vartheta)\Delta/(\eta_0 \sqrt{M(t)})$, which mainly yields the detection power. The modifier
condition (M) ensures that the drift term is strictly larger than the drift term for the case
$M(t) = t$ corresponding to the classic Np chart. This explains the superior performance of
the modified chart for small jumps.

If the change was not detected until time $\vartheta + M(t)$, a signal is given if

$$\frac{V(t)}{\sqrt{M(t)}} + \frac{\sqrt{M(t)}\, \Delta}{\eta_0} > k.$$

For the random part the same arguments as given above apply. But now under condition (M)
the drift term is strictly smaller than the drift term for the case $M(t) = t$. We may summarize
that the limit theorem indicates that the modified p chart is preferable to detect very small
jumps right after the change-point.

Also notice that Theorem 3.1 yields well defined limit distributions for small jumps of the
order $N^{-1/2}$. Clearly, for jumps of higher order, the drift diverges and dominates the random
part, such that the beneficial effect of the function $M(t)$ is not visible.
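The drift terms discussed above can be evaluated numerically. The following sketch returns the standardized drift implied by Theorem 3.1(ii) for a given buffer strategy; the function name `drift` and the parameter values (change at 0.5, Delta = 1, eta_0 = 1/2, cap 0.2) are assumptions chosen for illustration.

```python
import math


def drift(t, theta, delta, eta0, M):
    """Standardized drift of the limit process at time t for buffer strategy M."""
    m = M(t)
    if t < theta:
        return 0.0                                           # before the change
    if t < theta + m:
        return (t - theta) * delta / (eta0 * math.sqrt(m))   # buffer partly filled
    return math.sqrt(m) * delta / eta0                       # buffer fully out-of-control


def classic(t):
    return t              # M(t) = t, classic Np chart


def capped(t):
    return min(t, 0.2)    # hypothetical modified strategy, cf. Example 3.2


theta, delta, eta0 = 0.5, 1.0, 0.5
for t in (0.55, 0.6, 0.65):
    print(t, drift(t, theta, delta, eta0, classic), drift(t, theta, delta, eta0, capped))
```

Shortly after the change the capped strategy yields the larger drift, since the same shift contribution is divided by the smaller $\sqrt{M(t)}$.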

4 Practical issues of control chart design 

Unlike the classic Np-chart, the modified chart has two tunable parameters, namely, M and
k, which should be carefully selected in order to ensure small out-of-control ARLs (average
run length to detection) under the constraint that the in-control ARL (average run length to
false alarm) is not smaller than a given level.

We will now summarize our experience on tuning this chart by simulations, which are justified 
to some extent by the theoretical results presented in the previous section. The major issue 
is how to select the control limit. 

(i) In practice, the 3σ rule is often advocated, i.e., k = 3. However, this is not advisable
here, since it leads to excessively long in-control ARLs. For our control chart, the in-control
ARL also depends on the buffer length M. Selecting k = 2.34 and M = 9 we
obtain a first reasonable in-control ARL of about 500.

(ii) For a given buffer length the same in-control ARL is attained for k from a certain 
relatively long interval. This is due to the fact that J n is always an integer. 

(iii) Analysis of Figure 1, where the logarithm of the in-control ARL is plotted as a function of k
for different buffer lengths, reveals that it is advisable to select k at the left end of that
interval. That choice ensures the specified in-control ARL and minimizes the distance
UCL-LCL.
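Remark (ii) can be made explicit: since $J_n$ is an integer, only the smallest integer exceeding UCL matters, and this integer is constant over whole intervals of k. The sketch below, with the hypothetical helper `integer_alarm_threshold` and $p_0 = 1/2$ as default, shows such plateaus for M = 20.

```python
import math


def integer_alarm_threshold(M, k, p0=0.5):
    """Smallest integer count that exceeds UCL = M p0 + k sqrt(M p0 (1 - p0))."""
    ucl = M * p0 + k * math.sqrt(M * p0 * (1.0 - p0))
    return math.floor(ucl) + 1   # J_n must be strictly larger than UCL


for k in (1.8, 2.0, 2.2, 2.3):
    print(k, integer_alarm_threshold(20, k))
```

For M = 20 every k between roughly 1.79 and 2.23 leads to the same upper alarm threshold, so the upper side of the chart behaves identically over that range.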

In view of these remarks we suggest the following practical approach to select the parameters 
M and k of the chart. 

1. Select a desired in-control ARL, e.g., equal to 370. 

2. Select the buffer length M > 1. A discussion on selecting M is presented below. 

3. For a practical application one may simulate the in-control ARL for k varying from 
1 to 3. It is not difficult to find a reasonable k in this way, but determining exactly 
the smallest k, which guarantees the specified in-control ARL is a computationally 
demanding task. 
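Step 3 can be sketched as a small Monte Carlo routine. Under (3) the in-control binary observations are Bernoulli(1/2) whatever the error law F, so they are drawn directly. The function names, the number of runs, the seed, and the cap on the run length are assumptions made for this illustration, which is not the authors' simulation code.

```python
import math
import random
from collections import deque


def run_length(M, k, rng, p0=0.5, max_steps=100000):
    """Simulate one in-control run length of the two-sided modified p-chart."""
    buf = deque((1 if rng.random() < p0 else 0 for _ in range(M)), maxlen=M)
    count = sum(buf)                               # pre-run data fill the buffer
    half_width = k * math.sqrt(M * p0 * (1.0 - p0))
    lcl, ucl = M * p0 - half_width, M * p0 + half_width
    for n in range(1, max_steps + 1):
        z = 1 if rng.random() < p0 else 0
        count += z - buf[0]                        # update the moving count
        buf.append(z)
        if count < lcl or count > ucl:
            return n
    return max_steps                               # truncated run


def in_control_arl(M, k, runs=200, seed=1):
    """Average run length over independent simulated runs."""
    rng = random.Random(seed)
    return sum(run_length(M, k, rng) for _ in range(runs)) / runs


for k in (1.0, 2.0, 2.34, 2.5):
    print(k, in_control_arl(9, k))
```

With a common seed, values of k from the same plateau (such as 2.34 and 2.5 for M = 9) give identical estimates, which illustrates why the smallest k of a plateau is the natural choice.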

For the reader's convenience Table 1 summarizes some pairs (M, k) with minimal k (accuracy
0.01) ensuring an in-control ARL of approximately 435. Notice that in general the fact that
$J_n$ is integer-valued prevents the construction of a control chart with in-control ARL being
exactly equal to the target in-control ARL.

One may also select M to minimize the out-of-control ARL for a given jump height m. 
Figure [2] indicates that for jump heights m = 0.25, m = 0.5, and m = 0.75 there exist 
optimal buffer lengths M. The choices M = 71, M = 28, M = 23 are optimal for m = 0.25, 
m = 0.5, and m = 0.75, respectively, taking into account that the selection was made among 
a rather limited number of buffer lengths. Clearly, an exhaustive search may yield slightly 
better results. Note, however, that for m = 1 the plot is increasing and one might expect that 



13 



[Plot: logARL (vertical axis) against the threshold k (horizontal axis); legend: M=20, M=30, M=40.]

Figure 1: Dependence of the logarithm of the in-control ARL on the threshold k for different 
buffer sizes M. The results were obtained for Gaussian N(0, 1) errors by averaging 10 
simulation runs. 

Figure 2: Dependence of the out-of-control ARL on the buffer size M for different 
jump heights m and normal errors. 



M =     12     23     28     71     90     150    212    441
k =     2.31   2.30   2.27   2.02   2.0    1.8    1.65   1.39
ARL =   395    415    423    411    450    452    440    456

Table 1: Pairs of parameters (M, k) of the proposed chart ensuring an in-control ARL of the 
order 435. 



the best choice is for M < 12, but in this region one cannot attain an in-control ARL of order 
435. 



5 Simulation studies 



We performed extensive simulations addressing the following issues. First, we were interested 
in identifying pairs of the buffer length M and the threshold k ensuring a specified in-control 
ARL (at least approximately). Second, we investigated the out-of-control ARL for various 
jump heights when the underlying observations are normally distributed. Third, we compared 
the binary chart with other charts for the case of normally distributed error terms, focusing 
on the out-of-control ARL as a performance measure. Finally, we studied the behavior of the 
out-of-control ARL for the binary chart when the errors are non-normal. 
The simulation results are given in the tables below. All the results were obtained by averaging 
30000 simulation runs. The simulated jump occurred at time zero, and the buffer was filled with 
in-control pre-run observations. The results of the simulation studies can be summarized as follows. 
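
The simulation design just described (buffer filled by an in-control pre-run, jump present from the first monitored observation) can be sketched as follows. This is a minimal illustration under assumptions not taken from this section: indicators Z_i = 1{Y_i > 0}, p0 = 1/2, a one-sided alarm rule with limit k sqrt(M p0 (1 - p0)), and "RL Disp." read as the sample standard deviation of the run length. Function names and the number of replications are hypothetical.

```python
import numpy as np

def run_length(M, k, jump, rng, p0=0.5, max_steps=100_000):
    """One simulated run length of the assumed binary chart with N(0,1) errors."""
    limit = k * np.sqrt(M * p0 * (1.0 - p0))
    buf = (rng.standard_normal(M) > 0).astype(int)    # in-control pre-run fills the buffer
    count, pos = int(buf.sum()), 0
    for n in range(1, max_steps + 1):
        z = int(rng.standard_normal() + jump > 0)     # jump occurs at time zero
        count += z - int(buf[pos])                    # drop the oldest indicator, add the new one
        buf[pos] = z
        pos = (pos + 1) % M
        if count - M * p0 > limit:
            return n
    return max_steps                                  # censored run

def arl_and_dispersion(M, k, jump, reps=2000, seed=0):
    """Monte Carlo estimates of the ARL and of the run length standard deviation."""
    rng = np.random.default_rng(seed)
    rl = np.array([run_length(M, k, jump, rng) for _ in range(reps)])
    return rl.mean(), rl.std(ddof=1)

print(arl_and_dispersion(12, 2.31, jump=1.0))
```

The ring buffer keeps the cost per observation constant, so long in-control runs remain cheap to simulate.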
(i) For Gaussian errors and an in-control ARL fixed at 435, our chart with buffer 
length M = 150 (see Table 3) provides shorter out-of-control ARLs than CUSUM, 
Optimal EWMA, Shewhart-EWMA, GEWMA and GLR (see Han and Tsung (2004) 
for definitions), provided the jump is small. To be precise, the out-of-control ARL of 
our chart is about 243 for a jump m = 0.1σ, and about 97 for m = 0.25σ, while for the 
above mentioned charts we have ARLs between 295 and 324 and between 105 and 110, 
respectively. Simultaneously, the dispersion of the RL time of our chart is considerably 
smaller and equals about 172 for m = 0.1σ and about 59 for m = 0.25σ, while for the charts 
discussed in Han and Tsung (2004) we have RL time dispersions of the orders 267-324 
and 79-102, respectively. 

(ii) Qualitatively the same pattern can be observed when the in-control ARL is fixed 
at 840 and errors are Gaussian (see Table 4 and Han and Tsung (2004)). 



(iii) When the jump is larger than 0.5σ, the proposed chart is much slower than the above 
mentioned charts, but this shortcoming can easily be handled by applying several charts 
simultaneously and raising an alarm when one of them gives a signal. 



The proposed chart retains its advantages in the range of small jumps when the errors are 
double exponentially distributed and even behaves quite well for difficult distributions 
such as the Cauchy distribution (see Table 5). 
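
One way to see why heavy tails do little harm is that a chart based on binary indicators depends on the error distribution only through the success probability after the jump. The sketch below assumes indicators Z_i = 1{Y_i > 0} and errors symmetric about zero, so that a jump m changes the success probability from 1/2 to F(m), where F is the error distribution function. The unit scale parameters used for the Laplace and Cauchy distributions are assumptions; the normalization used in the simulations is not stated in this section.

```python
import math

def p_normal(m):
    """P(eps + m > 0) for standard normal errors."""
    return 0.5 * (1.0 + math.erf(m / math.sqrt(2.0)))

def p_laplace(m, b=1.0):
    """P(eps + m > 0) for Laplace errors with scale b (assumed scale)."""
    if m >= 0:
        return 1.0 - 0.5 * math.exp(-m / b)
    return 0.5 * math.exp(m / b)

def p_cauchy(m, gamma=1.0):
    """P(eps + m > 0) for Cauchy errors with scale gamma (assumed scale)."""
    return 0.5 + math.atan(m / gamma) / math.pi

for m in (0.1, 0.25, 0.5):
    print(m, p_normal(m), p_laplace(m), p_cauchy(m))
```

For small m the increase of the success probability is approximately m times the error density at zero, so under this reading the sensitivity to small jumps is governed by the height of the density at its center and not by the tails.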
A Proof of the main result 

Under the change-point model of Section 3 we are given an array $\{Z_{Ni} : 1 \le i \le N, N \ge 1\}$ 
of Bernoulli variables with conditional expectations $E(Z_{Ni} \mid \mathcal{F}_{N,i-1}) = p_0$ if 
$1 \le i < \lfloor N\vartheta \rfloor$, and $E(Z_{Ni} \mid \mathcal{F}_{N,i-1}) = p_{N1} = p_0 + \Delta/\sqrt{N}$ if 
$\lfloor N\vartheta \rfloor \le i \le N$, $N \ge 1$. 
Theorem A.1. (Durrett 2005, Theorem 7.3). Suppose $\{X_{n,m}\}$ is a martingale difference 
array with respect to $\{\mathcal{F}_{n,m}\}$. Define 

\[ S_{n,k} = \sum_{i=1}^{k} X_{n,i}, \qquad V_{n,k} = \sum_{1 \le i \le k} E(X_{n,i}^2 \mid \mathcal{F}_{n,i-1}), \qquad 1 \le k \le n. \]

If 

(i) $V_{n,\lfloor nt \rfloor} \to t$ in probability for all $t \in [0, 1]$ and 

(ii) for all $\varepsilon > 0$, $\sum_{m \le n} E(X_{n,m}^2 \mathbf{1}_{\{|X_{n,m}| > \varepsilon\}} \mid \mathcal{F}_{n,m-1}) \to 0$ in probability, 

then $S_{n,\lfloor n \cdot \rfloor} \Rightarrow B(\cdot)$, where B denotes a standard Brownian motion. 

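Proof (of Theorem 3.2). We first consider the case when there is no change. Let us introduce 
the partial sum process 

\[ Z_N(t) = \sum_{i=1}^{\lfloor Nt \rfloor} \xi_{Ni}, \qquad t \in [0, 1], \]

where $\xi_{Ni} = (Z_{Ni} - p_0)/\sqrt{N p_0 (1 - p_0)}$, $1 \le i \le N$. Let us first verify that the array 
$\{\xi_{Ni} : 1 \le i \le N, N \ge 1\}$ satisfies the assumptions of Theorem A.1. Clearly, 
$E(\xi_{Ni} \mid \mathcal{F}_{N,i-1}) = 0$ and 

\[ E(\xi_{Ni}^2 \mid \mathcal{F}_{N,i-1}) = \mathrm{Var}(\xi_{Ni} \mid \mathcal{F}_{N,i-1}) = N^{-1}, \]

for all $1 \le i \le N$, yielding 

\[ \sum_{i=1}^{\lfloor Nt \rfloor} E(\xi_{Ni}^2 \mid \mathcal{F}_{N,i-1}) = \frac{\lfloor Nt \rfloor}{N} \to t, \]
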

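as $N \to \infty$. The conditional Lindeberg condition is shown as follows. Since 
$E((Z_{Ni} - p_0)^2 \mid \mathcal{F}_{N,i-1}) \le 1$, $1 \le i \le N$, and $|Z_{Ni} - p_0| \le 1$, we obtain for any $\varepsilon > 0$ 

\[ L_N(\varepsilon) = \sum_{i=1}^{N} E\bigl(\xi_{Ni}^2 \mathbf{1}_{\{|\xi_{Ni}| > \varepsilon\}} \mid \mathcal{F}_{N,i-1}\bigr) 
 = N^{-1} \sum_{i=1}^{N} E\left( \frac{(Z_{Ni} - p_0)^2}{p_0(1 - p_0)} \mathbf{1}\left( \frac{|Z_{Ni} - p_0|}{\sqrt{p_0(1 - p_0)}} > \varepsilon \sqrt{N} \right) \Bigm| \mathcal{F}_{N,i-1} \right) 
 \le \frac{1}{N p_0(1 - p_0)} \sum_{i=1}^{N} P\left( \frac{|Z_{Ni} - p_0|}{\sqrt{p_0(1 - p_0)}} > \varepsilon \sqrt{N} \Bigm| \mathcal{F}_{N,i-1} \right). \]

The conditional Markov inequality yields for $1 \le i \le N$ 

\[ P\left( \frac{|Z_{Ni} - p_0|}{\sqrt{p_0(1 - p_0)}} > \varepsilon \sqrt{N} \Bigm| \mathcal{F}_{N,i-1} \right) 
 \le \frac{E((Z_{Ni} - p_0)^2 \mid \mathcal{F}_{N,i-1})}{\varepsilon^2 N p_0(1 - p_0)} \le \frac{1}{\varepsilon^2 N p_0(1 - p_0)}, \]

which implies 

\[ \lim_{N \to \infty} L_N(\varepsilon) = 0. \]

Hence, by Theorem A.1, 

\[ Z_N \Rightarrow B, \qquad N \to \infty. \]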
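
The two conditions of Theorem A.1 can be checked numerically in the simplest special case. The sketch below assumes independent Bernoulli(p0) indicators, so that conditional expectations reduce to ordinary ones; this assumption is made only for the illustration, and the function names are hypothetical. It evaluates the sum of conditional variances up to time t and the Lindeberg sum exactly.

```python
import math

def conditional_variance_sum(N, t):
    """Sum of E(xi_{Ni}^2) over i <= floor(N t); each term equals 1/N."""
    return math.floor(N * t) / N

def lindeberg_sum(N, eps, p0):
    """Exact value of sum_i E(xi_{Ni}^2 1{|xi_{Ni}| > eps}) for i.i.d. Bernoulli(p0)."""
    scale = math.sqrt(N * p0 * (1.0 - p0))
    total = 0.0
    for z, prob in ((1, p0), (0, 1.0 - p0)):
        xi = (z - p0) / scale
        if abs(xi) > eps:                 # only values exceeding eps contribute
            total += xi * xi * prob
    return N * total

for N in (10, 1_000, 1_000_000):
    print(N, conditional_variance_sum(N, 0.5), lindeberg_sum(N, 0.01, 0.5))
```

Since $|\xi_{Ni}|$ is bounded by a constant times $N^{-1/2}$, the indicator vanishes for all sufficiently large N, so in this special case the Lindeberg sum is exactly zero from some N onwards.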
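Now, as will be shown below for a more involved setting, 

\[ \sqrt{p_0(1 - p_0)} \left[ Z_N\left(t - \frac{1}{N}\right) - Z_N\left(t - \frac{M_{\lfloor Nt \rfloor}}{N} - \frac{1}{N}\right) \right] \Rightarrow \eta_0 [B(t) - B(t - M(t))], \]

as $N \to \infty$. Having in mind the rule (13), we conclude 

\[ J_N(t) - k \sqrt{M_{\lfloor Nt \rfloor} N^{-1} p_0(1 - p_0)} \Rightarrow \eta_0 [B(t) - B(t - M(t))] - k \sqrt{M(t)}\, \eta_0, \qquad N \to \infty, \]

which yields 

\[ S_N/N \stackrel{d}{\to} \inf\{ s \in (0, 1] : B(s) - B(s - M(s)) > k \sqrt{M(s)} \}, \]

as $N \to \infty$. 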

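To establish (ii), we consider three cases. 

Case 1: $\lfloor Nt \rfloor < \lfloor N\vartheta \rfloor$ is handled as above. 

Case 2: $\lfloor N\vartheta \rfloor \le \lfloor Nt \rfloor < \lfloor N\vartheta \rfloor + M_{\lfloor Nt \rfloor}$. Denote the set of corresponding values of t by $\mathcal{T}_2$. 
$J_N(t)$ equals 

\[ \frac{1}{\sqrt{N}} \sum_{i=\lfloor Nt \rfloor - M_{\lfloor Nt \rfloor}}^{\lfloor N\vartheta \rfloor - 1} (Z_i - p_0) 
 + \frac{1}{\sqrt{N}} \sum_{i=\lfloor N\vartheta \rfloor}^{\lfloor Nt \rfloor - 1} (Z_{Ni} - p_{N1}) 
 + \frac{1}{\sqrt{N}} \sum_{i=\lfloor N\vartheta \rfloor}^{\lfloor Nt \rfloor - 1} (p_{N1} - p_0). \tag{17} \]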

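Since $p_{N1} - p_0 = \Delta/\sqrt{N}$, the third term converges (pointwise) to the continuous function 
$\Delta(t - \vartheta)$, which implies that the convergence is also uniform in $t \in [\vartheta, \vartheta + M(t)]$. To handle 
the random terms put 

\[ \xi_{Ni} = \begin{cases} 
 (Z_i - p_0)/\sqrt{p_0(1 - p_0)N}, & 0 < i \le \lfloor N\vartheta \rfloor - 1, \\ 
 (Z_{Ni} - p_{N1})/\sqrt{p_{N1}(1 - p_{N1})N}, & \lfloor N\vartheta \rfloor \le i \le N. 
 \end{cases} \]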



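Again, the conditions of the functional martingale central limit theorem are satisfied, such 
that $Z_N(t) = \sum_{i=1}^{\lfloor Nt \rfloor} \xi_{Ni} \Rightarrow B(t)$. The first and second term in (17) are now given by 

\[ \sqrt{p_0(1 - p_0)} \left[ Z_N\left(\vartheta - \frac{1}{N}\right) - Z_N\left(t - \frac{M_{\lfloor Nt \rfloor}}{N} - \frac{1}{N}\right) \right] 
 + \sqrt{p_{N1}(1 - p_{N1})} \left[ Z_N\left(t - \frac{1}{N}\right) - Z_N\left(\vartheta - \frac{1}{N}\right) \right], \]

which equals $\varphi_N(Z_N)(t)$, if we define the sequence of functionals $\varphi_N : (D[0,1], d) \to (D[0,1], d)$, 
$N \ge 1$, by 

\[ \varphi_N(z)(t) = \sqrt{p_0(1 - p_0)} \left[ z\left(\vartheta - \frac{1}{N}\right) - z\left(t - \frac{M_{\lfloor Nt \rfloor}}{N} - \frac{1}{N}\right) \right] 
 + \sqrt{p_{N1}(1 - p_{N1})} \left[ z\left(t - \frac{1}{N}\right) - z\left(\vartheta - \frac{1}{N}\right) \right]. \]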



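Also define 

\[ \varphi(z)(t) = \sqrt{p_0(1 - p_0)} [z(t) - z(t - M(t))], \qquad z \in C[0, 1]. \]

By linearity, $\varphi_N$ is uniformly Lipschitz continuous, i.e., 

\[ \sup_{N \ge 1} \|\varphi_N(z_1) - \varphi_N(z_2)\|_\infty \le L \|z_1 - z_2\|_\infty, \]

for all $z_1, z_2 \in D[0, 1]$, where $L = 2 \sup_{N \ge 1} \sqrt{p_{N1}(1 - p_{N1})} < \infty$. Further, since any $z \in C[0, 1]$ 
is uniformly continuous, 

\[ \|\varphi_N(z) - \varphi(z)\|_\infty \to 0, \qquad N \to \infty. \]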



Let $\{z, z_N\} \subset D[0, 1]$ be a sequence with $z_N \to z \in C[0, 1]$ in the Skorohod metric, which 
implies $\|z_N - z\|_\infty \to 0$. Apply the triangle inequality to obtain 

\[ \|\varphi_N(z_N) - \varphi(z)\|_\infty \le \|\varphi_N(z_N) - \varphi_N(z)\|_\infty + \|\varphi_N(z) - \varphi(z)\|_\infty. \]

The first term is bounded by $L \|z_N - z\|_\infty \to 0$, $N \to \infty$, by the uniform Lipschitz continuity, 
and the second one tends to 0 by the uniform continuity of z. For $z \in C[0, 1]$ we have 
$\varphi(z)(t) = \sqrt{p_0(1 - p_0)} [z(t) - z(t - M(t))]$. 
Due to the Skorohod/Dudley/Wichura representation theorem, $Z_N \Rightarrow B$, $N \to \infty$, implies 
that there exist a probability space and equivalent versions of $Z_N$ and B defined on that new 
space, which we again denote by $Z_N$ and B, such that $\|Z_N - B\|_\infty \to 0$, $N \to \infty$, a.s. The 
above arguments ensure that 

\[ \varphi_N(Z_N)(t) \Rightarrow \varphi(B)(t) = \eta_0 [B(t) - B(t - M(t))], \]

as $N \to \infty$. 

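Case 3: $\lfloor N\vartheta \rfloor + M_{\lfloor Nt \rfloor} \le \lfloor Nt \rfloor$ is obvious. 

Putting things together yields the result for $J_N(t)$. Since the process $J_\vartheta$ is a.s. continuous, 
we may further conclude that 

\[ S_N/N \stackrel{d}{\to} \mathcal{T}_\vartheta = \inf\{ t \in [0, 1] : J_\vartheta(t) > k \sqrt{M(t)} \}, \]

as $N \to \infty$. 

□ 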
         M = 12, k = 2.31       M = 23, k = 2.3        M = 28, k = 2.27
Jump     ARL       RL Disp.     ARL       RL Disp.     ARL       RL Disp.
0        395.27    171.09       415.66    181.42       423.12    185.75
0.1      328.33    144.18       305.80    133.14       303.43    133.17
0.25     168.09    72.47        131.89    56.00        122.90    51.72
0.5      58.65     24.52        43.78     17.08        41.66     15.73
0.75     27.84     10.91        23.76     8.35         24.18     8.27
1        17.51     6.35         17.60     5.76         18.53     6.00
1.25     12.98     4.41         14.99     4.76         16.10     5.12
1.5      10.96     3.54         13.66     4.31         14.76     4.68
1.75     10.00     3.14         12.88     4.05         13.99     4.42
2        9.46      2.94         12.45     3.91         13.54     4.28
2.25     9.19      2.84         12.26     3.84         13.25     4.18
2.5      9.09      2.80         12.12     3.80         13.14     4.14
2.75     9.05      2.79         12.04     3.77         12.96     4.09
3        9.01      2.77         11.96     3.75         13.09     4.12

Table 2: Binary chart applied to observations with Gaussian errors. Chart tuned to in-control 
ARL about 435. Short buffer length. 
         M = 71, k = 2.02       M = 150, k = 1.8       M = 212, k = 1.65
Jump     ARL       RL Disp.     ARL       RL Disp.     ARL       RL Disp.
0        411.23    301.39       452.05    337.19       440.32    334.70
0.1      254.91    182.71       243.54    172.58       234.27    166.92
0.25     95.12     60.68        97.58     58.68        101.26    60.62
0.5      43.03     23.33        53.50     29.52        56.87     32.41
0.75     30.75     16.08        38.80     21.12        41.30     23.18
1        25.23     13.06        31.60     17.07        33.77     18.76
1.25     22.11     11.39        27.71     14.87        29.38     16.24
1.5      20.28     10.41        25.20     13.50        26.92     14.81
1.75     19.22     9.83         23.82     12.74        25.38     13.93
2        18.61     9.50         23.10     12.30        24.57     13.48
2.25     18.13     9.25         22.64     12.05        24.17     13.19
2.5      17.91     9.14         22.31     11.86        23.76     13.00
2.75     17.83     9.08         22.17     11.80        23.70     12.95
3        17.69     9.03         22.15     11.77        23.70     12.94

Table 3: Binary chart applied to observations with Gaussian errors. Chart tuned to in-control 
ARL about 435. Moderate and long buffer length. 
         M = 111, k = 2.19      M = 131, k = 1.84      M = 453, k = 1.35
Jump     ARL       RL Disp.     ARL       RL Disp.     ARL       RL Disp.
0        836.64    370.04       841.83    370.73       840.02    650.13
0.1      398.04    170.64       399.22    171.86       337.79    226.64
0.25     122.34    46.80        124.06    46.98        149.54    87.46
0.5      57.56     19.21        57.63     19.18        82.94     47.32
0.75     41.71     13.77        42.10     13.83        59.05     33.34
1        34.13     11.17        34.18     11.20        48.22     27.03
1.25     29.78     9.72         29.60     9.66         42.12     23.43
1.5      27.12     8.84         27.18     8.84         38.14     21.23
1.75     25.66     8.35         25.61     8.32         36.05     20.03
2        24.76     8.04         24.66     8.02         34.93     19.38
2.25     24.24     7.88         24.21     7.87         34.33     18.98
2.5      24.00     7.78         23.96     7.77         33.79     18.71
2.75     23.97     7.78         23.78     7.74         33.36     18.48
3        23.79     7.73         23.69     7.70         33.55     18.57

Table 4: Binary chart applied to observations with Gaussian errors. Chart tuned to in-control 
ARL about 840. Moderate and long buffer length. 
         Laplace (DblExp)       Cauchy
         M = 40, k = 2.22       M = 28, k = 2.28
Jump     ARL       RL Disp.     ARL       RL Disp.
0        437.69    315.91       420.79    300.12
0.1      191.35    133.31       334.82    240.04
0.25     59.51     37.02        167.28    116.28
0.5      28.51     15.09        64.17     41.76
0.75     22.04     11.13        37.59     22.47
1        19.33     9.69         27.27     15.21
1.25     17.70     8.84         22.70     12.03
1.5      16.85     8.37         20.51     10.58
1.75     16.15     8.02         18.86     9.55
2        15.78     7.83         17.93     9.00
2.25     15.55     7.69         17.23     8.58
2.5      15.34     7.59         16.67     8.28
2.75     15.19     7.52         16.29     8.07
3        15.07     7.46         15.98     7.90

Table 5: Comparison of ARLs of the binary chart with in-control ARL 435 when applied to 
non-Gaussian distributions. 30,000 independent simulation trials. 